(57) ABSTRACT

The invention relates to the insetting of a moving picture in 
a moving main picture when the picture signals are in an 
encoded digital format. The main idea is that the picture 
signals are combined prior to decoding. The frames of the 
picture to be inset are scaled down by reducing the number 
of macro blocks in them in such a manner that the picture 
whole is retained. The macro blocks of a certain area of the 
main picture are replaced by the macro blocks of the reduced 
picture and the combined video signal is decoded. The video 
signals (ES1, ES2) to be combined may be picked up from
different sources or extracted from a transport stream where 
they are in packets. The system according to the invention 
requires only a single decoder (210), which considerably 
reduces the amount of computation required by the combi- 
nation of the pictures. The advantage is emphasized if there 
are several pictures to be combined. 

13 Claims, 3 Drawing Sheets 
INSERTING ONE OR MORE VIDEO
PICTURES BY COMBINING ENCODED 
VIDEO DATA BEFORE DECODING 

BACKGROUND OF THE INVENTION 

1. Technical Field 

The invention relates to a method for adding a moving 
picture on top of a larger moving picture. The method is 
applicable to cases in which the picture signals are compressed
digital signals. The invention also relates to an
arrangement for adding a moving picture on top of a larger 
moving picture. 

2. Discussion of Related Art
The picture-in-picture (PIP) feature is a widely used
technique at the transmitting end when making a television
program, for example. Within the main picture a considerably
smaller picture is temporarily inset which displays, say,
a simultaneous event that is likely to interest the viewer. The
addition of a secondary picture to the main picture may also
occur at the receiving end, controlled by the user of the
receiver. One or more smaller pictures may e.g. display
programs running on other channels while the main picture
is being viewed without interruption. The present invention
relates particularly to the use of the PIP feature at the
receiving end.

When compressing a digitized video signal it is customary
to divide an individual picture frame into blocks which
typically comprise 8x8 picture elements, or pixels. A square
portion of a frame, comprised of four blocks, is called a
macro block. Compression, i.e. reduction of the number of
bits, is realized using intra-frame coding and inter-frame
coding. The former includes e.g. predictive coding utilizing
positional redundancy or the use of discrete cosine transform
(DCT) concentrating signal energy of a picture block. The
numbers produced by the transform are quantized in a
manner that reduces the quantity of bits and ordered in a
sequence where the occurrence of strings of consecutive
zeroes is statistically high. These strings of zeroes are
represented by a number indicating the quantity of the
zeroes (this is called run length coding, RLC). Other numbers
are encoded such that a frequently occurring number is
represented by fewer bits than a number that occurs less
frequently (variable length coding, VLC). Inter-frame coding
includes predictive inter-frame coding, motion estimation
comparing the contents of macro blocks at different
positions, utilizing temporal redundancy and temporal interpolation
of reference frames in order to produce the code for
the frames between them. By combining various coding
methods it is possible to reduce the number of bits transmitted
down to a hundredth part of the original without
substantially compromising the quality of the picture. At the
receiving end of the video signal a video decoder performs
the reverse operations. On the transmission path, several
video signals may travel packet switched in the same
transport stream (TS) so that the receiver first has to extract
an individual video signal from it.
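
The extraction of one video signal from a transport stream can be sketched as follows. This is only an illustration under stated assumptions: it assumes the MPEG-2 transport stream packet layout (188-byte packets, sync byte 0x47, 13-bit packet identifier in header bytes 1 and 2), the function names and the PID values 0x100 and 0x101 are hypothetical, and adaptation fields and PES headers are not parsed.

```python
TS_PACKET_SIZE = 188   # fixed MPEG-2 TS packet length in bytes
SYNC_BYTE = 0x47       # first byte of every TS packet


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit packet identifier from a TS packet header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport stream packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def extract_stream(ts: bytes, wanted_pid: int) -> list[bytes]:
    """Collect the 184-byte bodies of all packets carrying wanted_pid.

    Simplification: the 4-byte header is stripped blindly; adaptation
    fields and PES headers are ignored.
    """
    bodies = []
    for offset in range(0, len(ts) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = ts[offset:offset + TS_PACKET_SIZE]
        if packet_pid(packet) == wanted_pid:
            bodies.append(packet[4:])
    return bodies


def make_packet(pid: int, fill: int) -> bytes:
    """Build a synthetic packet for the demonstration (hypothetical helper)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes([fill]) * (TS_PACKET_SIZE - 4)


# Two interleaved video signals on hypothetical PIDs 0x100 and 0x101.
ts = make_packet(0x100, 1) + make_packet(0x101, 2) + make_packet(0x100, 3)
es1 = extract_stream(ts, 0x100)
es2 = extract_stream(ts, 0x101)
```

In this sketch `es1` and `es2` stand in for two separately extracted signals; a real receiver would reassemble them into coherent elementary streams before any further processing.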

From the prior art is known a PIP method wherein two
encoded video signals are separately decoded and combined
after the decoding. Prior to combining, one of the video
pictures is reduced in size. This can be done by selecting e.g.
every fourth block in both the horizontal and vertical dimension
of the video signal or by producing by means of
interpolation a new macro block from each 4x5 macro block
group. At a desired location of the normal-sized picture the
macro blocks are then replaced by the macro blocks of the
reduced picture. In this description and in the claims such a
reduced picture is called a "mini-picture". The prefix "mini"
means that the inset picture does not cover the whole main
picture. FIG. 1 shows in the form of functional block
diagram such a system according to the prior art. The system
comprises decoders 110 and 120 as well as a PIP unit 130.
A video signal ES1 (so-called "elementary stream") is
brought to decoder 110 and video signal ES2 to decoder 120.
Signals ES1 and ES2 are encoded e.g. according to the
MPEG2 (Moving Picture Experts Group) standard. Decoder
110 outputs video signal VD1 and decoder 120 video signal
VD2. The PIP unit 130 comprises a scaling unit 131, selector
132 and timing unit 133. Signal VD1 is directed straight to
the selector. Signal VD2 is directed to the scaling unit 131,
the output signal VD2' of which is conducted to the selector.
The output signal VDO of the selector 132 is either signal
VD1 or signal VD2' depending on the status of the selection
signal S output by the timing unit 133. Always when the
picture-generating system enters the area intended for the
mini-picture, signal S goes into a state that conducts signal
VD2' to the output of selector 132. At other times, signal S
is in a state that conducts signal VD1 to the output of selector
132. The functional blocks shown in FIG. 1 are realized
partly in software and partly in hardware.

A disadvantage of the method described above is that it 
requires a double decoding operation. So, when using a 
signal processor, a double decoding capacity is required of 
it, which results in considerable extra costs. Another disad- 
vantage is that in practice, for the reason stated above, only 
one mini-picture may be inset in the main picture. 

SUMMARY OF INVENTION 

An object of the invention is to eliminate the above- 
described disadvantages associated with the prior art. 

In accordance with a first aspect of the invention, a 
method for insetting a moving secondary picture in a mov- 
ing main picture, in which method the signals of said 
pictures are in an encoded digital format and the secondary 
picture is scaled down by reducing the number of macro 
blocks included in each individual frame of the picture and 
the scaled-down frame is inset as a mini-picture in the frame
of the main picture, is characterized in that said insetting is 
performed prior to the decoding of the picture signals. 

In further accord with the first aspect of the invention, in
order to inset the frame of the scaled-down secondary
picture, or mini-picture, the code of the macro blocks in a
certain area of the frame of the main picture is replaced by
the code of the macro blocks of the frame of the mini-picture.
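
The replacement of macro block code in a certain area can be pictured with the following sketch. It is an illustration only, not the implementation of the invention: a frame is modeled as a grid of opaque, independently replaceable macro block codes, which ignores the differential coding between neighbouring macro blocks found in real compressed streams. The function name is hypothetical; the parameters `r1` and `c1` follow the row and column parameters used in the detailed description.

```python
def inset_mini_picture(main, mini, r1, c1):
    """Return a copy of `main` in which the macro block codes of the area
    starting at row r1, column c1 are replaced by the codes of `mini`.

    `main` and `mini` are lists of rows; each element is treated as the
    opaque code of one macro block (a modeling assumption).
    """
    rows, cols = len(mini), len(mini[0])
    if r1 < 0 or c1 < 0 or r1 + rows > len(main) or c1 + cols > len(main[0]):
        raise ValueError("mini-picture does not fit inside the main picture")
    combined = [row[:] for row in main]        # leave the input untouched
    for r in range(rows):
        combined[r1 + r][c1:c1 + cols] = mini[r]
    return combined


# Main picture of 4 x 6 macro blocks, mini-picture of 2 x 2 macro blocks.
main = [[f"M{r}{c}" for c in range(6)] for r in range(4)]
mini = [["p00", "p01"], ["p10", "p11"]]
combined = inset_mini_picture(main, mini, r1=1, c1=3)
```

Only the combined grid would then be passed to a decoder, which is the point of performing the insetting before decoding.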

Still in accord with the first aspect of the invention, the 
reduction of the number of macro blocks in a frame is 
realized by leaving a selected number of macro blocks at 
regular intervals both in the horizontal and in the vertical 
dimension. 
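
A minimal sketch of this selection at regular intervals is given below. It assumes that a frame can be modeled as a grid of opaque macro block codes and that the last macro block of each group of I is the one kept; which member of each group is kept is an assumption of the sketch, chosen so that the result has R/I rows and C/I columns rounded down, as stated in the description of FIG. 4b. The function name is hypothetical.

```python
def scale_down(frame, interval):
    """Keep every `interval`-th macro block row and, within each kept row,
    every `interval`-th macro block.

    Assumption: the kept rows/columns are those at positions I, 2I, 3I, ...
    (1-based), so a frame of R x C macro blocks yields
    floor(R / I) x floor(C / I) macro blocks.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    start = interval - 1
    return [row[start::interval] for row in frame[start::interval]]


# Frame of R = 7 rows and C = 10 columns; each cell records its position.
frame = [[(r, c) for c in range(10)] for r in range(7)]
mini = scale_down(frame, interval=3)
```

With R = 7, C = 10 and I = 3 the sketch keeps rows 2 and 5 and columns 2, 5 and 8 (0-based), giving a 2 x 3 mini-picture.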

Still further in accord with the first aspect of the invention, 
the reduction of the number of macro blocks in a frame is 
realized by compiling each new macro block from the 
blocks of at least two original macro blocks. 
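
One conceivable way of compiling a new macro block from the blocks of several original macro blocks is sketched below. The choice made here, taking from each macro block of a 2x2 group the block in the matching position, is an assumption for illustration; the text does not prescribe which blocks are selected. A macro block is modeled as a 2x2 grid of opaque block codes, and the function names are hypothetical.

```python
def compile_macro_block(group):
    """Build one macro block from a 2x2 group of macro blocks.

    `group[y][x]` is a macro block, itself a 2x2 list of block codes.
    The new macro block takes, at position (y, x), the block found at
    position (y, x) of the macro block at group position (y, x).
    """
    return [[group[y][x][y][x] for x in range(2)] for y in range(2)]


def scale_down_by_compiling(frame):
    """Halve a frame in both dimensions, one new macro block per 2x2 group."""
    rows, cols = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            compile_macro_block(
                [
                    [frame[2 * r][2 * c], frame[2 * r][2 * c + 1]],
                    [frame[2 * r + 1][2 * c], frame[2 * r + 1][2 * c + 1]],
                ]
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def make_macro_block(r, c):
    """Synthetic macro block whose block codes record their origin."""
    return [[f"r{r}c{c}b{y}{x}" for x in range(2)] for y in range(2)]


frame = [[make_macro_block(r, c) for c in range(4)] for r in range(4)]
reduced = scale_down_by_compiling(frame)
```

Each new macro block in `reduced` therefore contains blocks from four different original macro blocks, so the picture whole is retained at half the macro block count per dimension.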

Further still in accord with the first aspect of the 
invention, the reduction of the number of encoded macro 
blocks in a frame is realized by decoding the video signal to 
be scaled down by including in the decoding from each 
block at least the number that represents the dc component 
of the video signal, producing one new macro block by 
means of interpolation from a predetermined number of 
macro blocks produced, and encoding the signal produced 
using the same coding method as that used for the signal of
the main picture.

According further to the first aspect of the invention, there
are at least two secondary pictures to be inset in the main
picture.

According still further to the first aspect of the invention,
in which the code of the main picture and the code of the
secondary picture are transmitted in fixed-form packets via
the same transmission path as part of a transport stream, the
packets belonging to said pictures are extracted from the
transport stream on the basis of identifiers in the headers of
the packets and a coherent main picture code and coherent
secondary picture code are generated and then combined.

According to a second aspect of the invention, an arrangement
for insetting a moving secondary picture in a moving
main picture, the signals of said pictures being in an encoded
digital format and the arrangement comprising means for
scaling down each individual frame in the secondary picture
and combining them with the frame of the main picture, also
comprises a decoder for decoding the combined frame code
into a video signal.

Further according to the second aspect of the invention,
the means for scaling down an individual frame and combining
it with the frame of the main picture comprises a unit
for reducing the number of the macro blocks in a frame, a
selector for picture signals, and a timing unit for placing the
mini-picture at a certain location in the main picture.

In still further accord with the second aspect of the
invention, the arrangement also comprises means for
extracting the encoded signal of the main picture from the
transport stream comprised of data packets, and for extracting
the encoded signal of the secondary picture from said
transport stream.

According to a third aspect of the invention, a receiver
comprises a combiner responsive to at least two video
signals for providing an output signal, and a decoder,
responsive to said output signal, for providing a decoded
output signal for simultaneously displaying at least two
pictures corresponding to said at least two video signals.

The main idea of the invention is that the video signals are
combined prior to the decoding. The encoded macro blocks
of the area of the main picture intended for the mini-picture
are replaced by macro blocks from another picture. These
are obtained e.g. by taking macro blocks at regular intervals
in such a manner that their total number equals the number
of macro blocks that corresponds to the mini-picture area. A
single mini-picture macro block may also be produced by
assembling it from selected blocks of several original macro
blocks or by interpolating a plurality of original macro
blocks into a single macro block. The combined signal is
then decoded.

An advantage of the invention is that a receiver only needs
one decoder. In video processing, decoding is the part that
requires the most computing. Another advantage of the
invention is that several mini-pictures can be inset in the
main picture with only minimum added computation. A
further advantage of the invention is that the decoder is
detached from the rest of the receiver so that the capacity of
the transmission system between the decoder and receiver
may be dimensioned according to the band of one channel
only. Yet another advantage of the invention is that if the
video signals to be combined are brought to the receiver in
the same transport stream, only one complete demultiplexer
is required which extracts from the transport stream the
packets relating to all auxiliary activities as well.

BRIEF DESCRIPTION OF THE DRAWING

The invention is below described in detail. Reference will
be made to the accompanying drawing in which

FIG. 1 shows a block diagram illustrating the principle of
an arrangement according to the prior art,

FIG. 2 shows a block diagram illustrating the principle of
the arrangement according to the invention,

FIG. 3 shows an example of a PIP image on a screen,

FIG. 4a is a flow diagram showing an example of the
operation of the scaling unit of FIG. 2,

FIG. 4b shows the area of a mini-picture,

FIG. 4c is a flow diagram showing an example of the
operation of the timing unit of FIG. 2, and

FIG. 5 shows in the form of block diagram an example of
a system according to the invention.

DETAILED DESCRIPTION OF PREFERRED
EMBODIMENTS

FIG. 1 was already discussed in conjunction with the
description of the prior art.

FIG. 2 shows in the form of functional block diagram a
PIP implementation according to the invention. It comprises
a PIP unit 230 and decoder 210. The PIP unit comprises a
scaling unit 231, selector 232 and timing unit 233. Video
signals ES1 and ES2 comprised of codes of horizontally
[...] macro blocks of a picture are brought to the system.
The PIP unit 230 can be viewed as a combiner of the signals
ES1, ES2. As can be seen in FIG. 2, it is responsive to the
signals ES1, ES2 for providing an output signal CS to the
decoder 210. Signal ES1 is conducted direct to the selector
232. Signal ES2 is conducted to the scaling unit 231 in
which the picture is reduced using a known method. The
output signal ES2' of the scaling unit is conducted to the
selector 232. The output signal CS of the selector is either
signal ES1 or signal ES2', depending on the status of the
selection signal S output by the timing unit 233. Always
when signal ES1 contains the code of a macro block meant
for the area intended originally for the mini-picture on the
screen, signal S is in a state that directs signal ES2' to the
output of selector 232. At other times signal S is in a state that
directs signal ES1 to the output of selector 232. Signal S is
generated for the video signals ES1 and ES2 from temporally
bound synchronization signals SYN, which are used to
synchronize the operation of other units, too. Signal CS is
conducted to decoder 210 which outputs a complete digital
video signal VDO. Compared to the structure of FIG. 1 this
structure has one decoder less, which means an almost fifty
percent drop in the need for computing capacity since
decoding requires much more computation than the PIP
function. Applying the principle according to FIG. 2 it is
possible to inset several mini-pictures in the main picture.
For each mini-picture there is needed a separate scaling unit in
block 230 and extensions to selector 232 and timing unit
233. One common decoder is still enough, which emphasizes
the advantage over the prior art.

FIG. 3 shows an example of an image produced by signal
VDO of FIG. 1 or 2. It has a main picture 31 and an inset
mini-picture PIP.

FIG. 4a shows in the form of a flow diagram an example
of the operation of the scaling unit 231. A complete frame
comprises C macro blocks, or macros, horizontally and R
macros vertically. In the example, every Ith macro is selected
both horizontally and vertically in the frame of the picture to
be reduced, the selected macros constituting signal ES2' of
FIG. 2. In step 401 program reception is started. In step 402,
the processing of a frame is begun. In step 403
the values of variables r, i, c and j needed in the processing
are initialized. Variable r is the number of a macro row in the
complete picture, and variable c is the number of a macro
column in the complete frame. Variable i is a row number,
counting from the row that was last used to select macros for
a mini-picture. Variable j is a column number, counting from
the column that was last used to select a macro for a
mini-picture. In step 404, row numbers r and i are incremented.
In step 405 it is checked whether the last row of the
frame was already processed. If so, the processing of the
next frame is begun (step 402). If the frame is unfinished, it
is checked according to step 406 whether the row being
processed is a row on which macros are to be selected. If not,
the column number c is incremented, step 407. In step 408
it is checked whether the macro row has come to an end. If
not, the process moves on to the next macro on the row, step
409, and repeats steps 407 and 408. If the macro row has
come to an end, the column number c is reset in step 410 and
the process continues at step 404. If the row is a row on
which macros are to be selected, the row number i is reset
(step 411) and column numbers c and j are incremented (step
412). In step 413 it is checked whether the macro row has
come to an end. If not, the process moves on to the next
macro, step 414. In step 415 it is checked whether the
column processed is a column on which macros are to be
selected. If not, the process continues at step 412. If it is, the
macro is saved according to step 416. At the same time
column number j is reset. The process then continues at step
412.

FIG. 4b shows the area of a mini-picture. It comprises
vertically R' macro block areas and horizontally C' macro
block areas. Corresponding to the markings of FIG. 4a, the
number of rows R' equals the ratio R/I rounded off to the
nearest smaller integer, and the number of columns C' equals
the ratio C/I rounded off to the nearest smaller integer. The
mini-picture starts vertically from row r1 of the complete
frame and horizontally from column c1 of the complete
frame.

FIG. 4c shows in the form of flow diagram an example of
the operation of the timing unit 233. A logic element or
program corresponding to the diagram has at its disposal the
values of variables r and c produced by the operation
according to FIG. 4a. In step 421 the value of the row
variable r and the value of the column variable c are read. In
step 422 it is checked whether the current position in the
complete frame is on a row that belongs to the mini-picture
area. If not, selection signal S is set to zero (step 424), which
state corresponds in FIG. 2 to the selection of signal ES1 by
selector 232. If the current row falls within the mini-picture
area, it is checked according to step 423 whether the current
position in the complete frame is in a column that belongs
to the mini-picture area. If not, the process moves on to step
424. If the current column falls within the mini-picture area,
selection signal S is set to one (step 425), which state
corresponds in FIG. 2 to the selection of signal ES2' by
selector 232. The setting of signal S is realized in a synchronized
manner between two consecutive macro block
times. After the setting, the values of variables r and c are
read again. Operation corresponding to FIG. 4c may also be
realized such that the comparison corresponding to step 422
is made after the value of variable r has been incremented,
and the comparison corresponding to step 423 is made after
the value of variable c has been incremented.

In the operation according to FIGS. 4a, 4b and 4c the
position of the mini-picture to be inset in the main picture is
determined by means of parameters r1, c1. The size of the
mini-picture is determined by parameter I. The width-to-height
ratio of the mini-picture is thus the same as that of the
main picture. The ratio can be made freely selectable if an
additional parameter is introduced in the program, determining
how many columns or rows of the mini-picture
frame produced as described above are included in the final
mini-picture.

FIG. 5 shows an example of a system in which pictures
are combined in accordance with the invention. The system
comprises a front end 551 which receives a radio-frequency
signal and outputs a baseband digital transport stream signal
TS. Signal TS comprises consecutive packets comprised of
a header and transport data proper. The header comprises,
among other things, the packet identity data (PID) or identifiers.
The transport stream may include packets containing
the code of several different video signals and, in addition,
packets associated with various auxiliary activities of the
receiver. Signal TS is conducted to a TS demultiplexer 541.
This is a complete demultiplexer, which means it monitors
a relatively large number of PID numbers, and extracts from
the transport stream the respective packets. The demultiplexer
541 sends the code ES1 of the selected video signal
to a selector 545 and the data contents of the other extracted
packets to a host processor 560 in the system. Signal TS is
also conducted to a second TS demultiplexer 542. This is a
relatively simple demultiplexer which extracts from the
transport stream only the packets associated with a particular
video signal. Of these the demultiplexer 542 produces video
signal ES2 and feeds it to a PIP unit 530, to a scaling unit in
it. In this example, video signal ES3 from a video disc drive
552 is also brought to selector 545. Signal ES3 is encoded
in the same manner as signals ES1 and ES2. Selector 545
outputs the main picture signal ESs, which is either ES1 or
ES3, depending on a control issued by the host processor
560. Signal ESs is conducted to the PIP unit 530, which
corresponds to the PIP unit 230 in FIG. 2. Signal ES3 is also
conducted direct to the PIP unit 530, to a scaling unit in it.
Unit 530 reduces the number of macro blocks of signal ES2,
signal ES3 or both, depending on a control issued by the
processor 560. Furthermore, unit 530 substitutes macro
blocks of the scaled-down pictures for a certain portion of
the macro blocks of signal ESs corresponding to the main
picture. Thus in this example it is possible to inset one or two
pictures in the main picture. The output signal MP of unit
530 is conducted to a common decoder 510 which outputs
a complete digital video signal VDO. It is used to generate
the analog or digital signals controlling the display. The
system also comprises a unit 561 to receive selection and
control data from the user. The host processor 560 is
connected with the other units via a bus 562.

In the foregoing, embodiments according to the invention
were described. The invention is not limited to these
embodiments. For example, in the scaling of the picture
inset in the main picture it is possible to use interpolation
also in the case of a common decoder. In that case, the
picture to be reduced is decoded in a simple manner e.g. by
selecting from the numbers produced by the DCT only the
numbers representing the dc components of each block.
Scaling is then performed using interpolation, followed by
new encoding prior to the combination of the pictures. The
extraction of different video signals from the transport
stream may also be realized using a single demultiplexer
instead of two or more. The invention may be applied in
different ways within the scope defined by the appended
claims.

What is claimed is:

1. A method for insetting a moving secondary picture in
a moving main picture, in which method the signals of said
pictures are encoded in frames in a digital format, the
method comprising the steps of:
scaling down the secondary picture by reducing a number
of macro blocks included in each individual frame of
the secondary picture and
insetting the scaled-down frame as a mini-picture in the
frame of the main picture,
wherein said insetting is performed prior to decoding said
signals of said pictures.

2. The method according to claim 1, wherein in the step
of insetting the scaled-down frame, a code of the macro
blocks in a certain area of the frame of the main picture is
replaced by a code of the macro blocks of the frame of the
mini-picture.

3. The method according to claim 1, wherein the step of
scaling down the secondary picture is realized by leaving a
selected number of macro blocks at regular intervals both in
a horizontal and in a vertical dimension.

4. The method according to claim 1, wherein the step of 
scaling down the secondary picture is realized by compiling 
a new macro block from blocks of at least two original 
macro blocks.

5. The method according to claim 1, wherein said reduction
of the number of encoded macro blocks in a frame is
realized by

decoding the video signal by including in the decoding
from each block at least a number that represents a dc
component of the video signal,
producing one new macro block by interpolating from a
predetermined number of macro blocks produced, and
encoding a signal produced by said step of producing
using a same coding method that is used for the signal
of the main picture.

6. The method according to claim 1, wherein in the steps
of scaling down and insetting, there are at least two secondary
pictures for said scaling down and said insetting in the
main picture.

7. The method according to claim 1, in which in order to
perform the steps of scaling down and insetting, code of the
main picture and code of the secondary picture are transmitted
in packets via a same transmission path as part of a
transport stream, wherein the packets belonging to said
pictures are extracted from the transport stream according to
identifiers in headers of the packets, and a main picture code
and secondary picture code are generated and then combined.

8. An arrangement for insetting a moving secondary 
picture in a moving main picture, wherein signals of said 
pictures are in an encoded digital format, the arrangement 
comprising: 
means (230) for scaling down an individual frame in the
secondary picture by reducing a number of macro 
blocks included in each individual frame of the sec- 
ondary picture and combining a resulting scaled-down 
frame with a frame of the main picture for providing 
combined frame code, and 

a decoder (210) for decoding the combined frame code 
into a video signal. 

9. The arrangement according to claim 8, wherein the 
means for scaling down an individual frame and combining 
it with the frame of the main picture comprises a unit (231) 
for reducing a number of the macro blocks in a frame, a 
selector (232) for picture signals, and a timing unit (233) for 
placing the scaled-down frame as a mini-picture at a certain
location in the main picture. 

10. The arrangement according to claim 8, further com- 
prising means (541, 542) for extracting the signal of the 
main picture from a transport stream comprising data 
packets, and for extracting the signal of the secondary 
picture from said transport stream.

11. A receiver comprising: 

a combiner (230) for insetting a moving secondary picture 
in a moving main picture using scaling down of the 
secondary picture by reducing a number of macro 
blocks included in each individual frame of the sec- 
ondary picture, responsive to at least two signals in a 
digital format for providing an output signal (CS), 
wherein said at least two signals represent the main 
picture and the at least one secondary picture, 
respectively, and 

a decoder, responsive to said output signal, for providing 
a decoded output signal for simultaneously displaying 
at least two pictures corresponding to said at least two 
video signals. 

12. The method according to claim 2, in which in order to 
perform the steps of scaling down and insetting, the code of 
the main picture and the code of the secondary picture are 
transmitted in fixed-form packets via a same transmission 
path as part of a transport stream, wherein the packets are 
extracted from the transport stream according to identifiers 
in headers of the packets, for generating and then combining 
a coherent main picture code and a coherent secondary 
picture code. 

13. The arrangement according to claim 9, wherein it also 
comprises means (541, 542) for extracting the encoded 
signal of the main picture from the transport stream com- 
prised of data packets, and for extracting the encoded signal 
of the secondary picture from said transport stream. 

# * * * * 